APPROACHES TO ITERATIVE ALGORITHMS FOR SOLVING NONLINEAR EQUATIONS WITH AN APPLICATION IN TOMOGRAPHIC ABSORPTION SPECTROSCOPY

In this paper we propose an approach for solving systems of nonlinear equations without computing function derivatives. Motivated by the application area of tomographic absorption spectroscopy, which is a highly-nonlinear problem with variables coupling, we consider a situation where straightforward translation to a fixed point problem is not possible because the operators that represent the relevant systems of nonlinear equations are not self-mappings, i.e., they operate between spaces of different dimensions. To overcome this difficulty we suggest an “alternating common fixed points algorithm” that acts alternatingly on the different vector variables. This approach translates the original problem to a common fixed point problem for which iterative algorithms are abound and exhibits a viable alternative to translation to an optimization problem, which usually requires derivatives information. However, to apply any of these iterative algorithms requires to ascertain the conditions that appear in their convergence theorems. To circumvent the need to verify conditions for convergence, we propose and motivate a derivative-free algorithm that better suits the tomographic absorption spectroscopy problem at hand and is even further improved by applying to it the superiorization approach. This is presented along with experimental results that demonstrate our approach.


INTRODUCTION
Solving systems of nonlinear equations has long been and still is a fundamental problem in mathematics with countless real-world applications that demand efficient methods to accomplish the task, see, e.g., the books by Ortega and Rheinboldt [36] and by Rheinboldt [37].A Google Scholar search on this topic returns, not surprisingly, numerous entries.A central approach is based on transforming the systems of nonlinear equations to an optimization problem and using methods from that field, mainly methods for unconstrained minimization which are commonly geared toward convex functions, see, e.g., Boyd and Vandenberghe's book [4].
In practical situations it is, however, often the case that no derivatives of the functions that comprise the system of equations are known and even if derivatives exist, they are not calculable.This hinders the applicability of minimization methods and a variety of heuristics have been developed such as, for example, the simulated annealing method, consult, e.g., the information on the auto-generated ScienceDirect Webpage on this subject 1 .
An alternative route is to transform the system of nonlinear equations into an operator equation such that every solution to the system is a fixed point of the operator and vice versa.This approach hinges on the premise that the associated operator is a selfmapping from a space into itself.In this paper we investigate the situation when this does not hold and the operator maps one space into a different space.For this scenario we propose an alternating fixed point approach.Specifically, we define a suitable family of fixed point operators that allows constructing an alternating common fixed points algorithm that can handle the problem.
However, in order to guarantee the convergence of the algorithm, some quite restrictive conditions on these operators are required.In particular, the approach seems not to be practical for the real-world application in tomographic absorption spectroscopy in which we are interested here.
Facing this kind of problems and motivated by the alternating fixed point approach described above, we take a deeper look at the properties of the equations at hand and suggest to use them in a different way.In particular, we address the case in which the equations depend on two variables, and the dependence on one of them is linear.For this problem, we propose a derivative-free algorithm which also acts alternatingly on each of the variables but makes use of descent directions of the summands of the least squares problem associated to the system of equations.We call it the descent pairs algorithm (DPA).The iterative nature of our descent pairs algorithm enables us to introduce a priori conditions into its iterative process.We do this via the superiorization methodology.This methodology works by taking an iterative algorithm, investigating its perturbation resilience, and then, using proactively such permitted perturbations, it forces the perturbed algorithm to do something useful in addition to what it is originally designed to do.We present a numerical validation of the descent pairs algorithm, with and without superiorization, for a real-world application in tomographic absorption spectroscopy.In this field, problems as the one presented here have been widely studied and a great variety of algorithms have been employed to tackle them.Our experiments show that the approach proposed here yields results that compete well with those obtained by these methods under similar conditions.As a general comment we care to mention that common fixed point iterative methods and related problems are a field of vigorous research with many new directions and developments, see, e.g., [1,25,39] and Cegielski's book [9].
The paper is structured as follows.In Section 2, we briefly recall the well-known fixed point approach for tackling systems of nonlinear equations and present our alternating common fixed points algorithm adapted for the case in which the operator is not a self-mapping.In Section 3, the particular instance of the problem with the linear relation is tackled.We present our descent pairs algorithm and mathematically support the idea behind the algorithmic scheme.Section 4, contains a broad view of tomographic absorption spectroscopy theory and in the last subsection, we include experiments that successfully demonstrate the good performance of our descent pairs algorithm, with and without superiorization, in this field.

ALGORITHM
We are interested in solving a system of nonlinear equations as formulated next.
Problem 1.Let R M be the Euclidean M-dimensional space.Consider a family of functions The motivation to look at this problem comes from a real-world application in tomographic absorption spectroscopy (TAS), see, e.g., Dai et al. [19], that we discuss later in Section 4. In some real-world applications, including TAS, the following condition prevails.
Condition 1.No derivatives of the functions β k are known and even if they exist they are not calculable.
A common approach for solving systems of nonlinear equations in such a situation consists of translating the system into a fixed point problem (FPP), see, e.g., Combettes [18] and Combettes and Pesquet [17].
To illustrate this approach, consider a system of nonlinear equations where γ j : R M → R, for j = 1, 2, . . ., M, are given real-valued functions and c = (c j ) M j=1 ∈ R M is a given vector.Denoting by Γ : R M → R M the operator the system (2.2) becomes an operator equation Since Γ is a self-mapping, it is well-known how to translate the system (2.2) into a problem of finding a point in the set of fixed points Fix(T suitably defined operator T (see, e.g., Berinde's book [3,Chapter 8,page 179]).This is done by considering the operator T : R M → R M given by where Id : R M → R M is the identity operator, that is, Id(x) = x.Then, it is easy to see that for a point and one can solve the system (2.2) by solving the fixed point problem for the operator T .The difficulty in applying this fixed point approach to the problem posed in (2.1) lies in the fact that β k : R M × R M → R M are not self-mappings.To the best of our knowledge, the challenge of adapting the fixed point theory to this setting has not been attended before.Therefore, we create here an alternating common fixed point algorithm that applies the approach of fixed point theory alternatingly to each of the two vector variables x and y of the functions β k of (2.1).
The adaptation of the fixed point theory to our problem works as follows.For any pair x, y ∈ R M and for any k = 1, 2, . . .,W , define the operators Then B k : R M × R M → R M × R M are self-mappings and, for all k = 1, 2, . . .,W , Now use the technique of Equation (2.5) and define, for all k = 1, 2, . . .,W, the operators i.e., Then, for all k = 1, 2, . . .,W, and finding a solution to Problem 1 is equivalent to solving the common fixed point problem (CFPP) Fix(T k ). (2.12) There is an extensive literature of algorithms devoted to solving the CFPP in (2.12), see, e.g., Zaslavski's book [40].As an example, the algorithm presented here in Algorithm 1 is the cyclic version of the almost cyclic sequential algorithm (ACSA) for the common fixed point problem, in Censor and Segal [14, Algorithm 5], which is, in turn, a special case of the algorithm in Combettes [16,Algorithm 6.1].
For the operators defined in (2.10), Algorithm 1 leads to the proposed alternating common fixed points algorithm.It is the iterative process, that starts with an arbitrary x 0 y 0 ∈ R M × R M , and then, for all ℓ ≥ 0, updates according to However, the convergence theorem for ACSA, see, e.g., [14,Theorem 6], requires additional assumptions on the involved operators.They should be "directed operators" 2 such that, for all k, T k − Id should be closed at 0. An (uninteresting) example verifying these properties are the projection operators onto the diagonal subspace.
Algorithm 1: The Cyclic Sequential Algorithm (CSA) for the common fixed point problem The difficulty of having operators whose block coordinates are identically defined, as is the case for the operators in (2.10), and verifying the assumptions of the convergence theorem for ACSA, lead us to search for an alternative approach to handle the problem.In the next section we develop a different algorithmic scheme which is inspired by the iterative process (2.13).

THE DESCENT PAIRS ALGORITHM
In this section we propose and motivate a derivative-free algorithm for tackling Problem 1 in the case when the operators fulfill the following two assumptions.We denote the set of vectors whose components are all different from zero by Assumption 1.For every k ∈ {1, 2, . . .,W }, the operators β k depend linearly on the second variable and nonlinearly on the first variable, such that they can be expressed, in matrix form, as 2 Directed operators are nowadays called "cutters", see [9,Subsection 2.1.3].
where β k : R M → R M ̸ =0 and diag(u) denotes the diagonal matrix with the vector u along its diagonal.
For our second assumption, we need to introduce the following definitions.Definition 3.1.(i) Descent direction (see, e.g., [2,Definition 5.1]).Given a function g : R M → R which is differentiable at some vector x ∈ R M , then a vector v ∈ R M is called a descent direction for g at the point x if is a descent direction at the point x for the function g : R M → R defined as
In the tomographic absorption spectroscopy problem, modeled as Problem 1 and studied in subsection 4.3 below, Assumptions 1 and 2 are fulfilled.
The next lemma shows that, although Problem 1 is a system which depends on both variables x and y, the Assumption 1 on the operators β k implies that the variable vector y does not interfere with the suitability of the variable vector x for solving the system.Lemma 3.2.Consider a family of operators , for k ∈ {1, 2, . . .,W }, for which Assumption 1 holds.Then, a point x * ∈ R M belongs to a solution pair (x * , y * ) of the system of equations given by (2.1) if and only if x * is a solution of the system Proof.Let (x * , y * ) be a solution of the system given by (2.1).Then, by Assumption 1, for all k,t ∈ {1, 2, . . .,W }, and using elementary rules for matrix inversion and diagonal matrices, we have Thus, x * is a solution for the single vector variable system (3.8).□ In Algorithm 2 below, which we call the descent pairs algorithm, we present a method for tackling a system of the form (2.1) that obeys Assumptions 1 and 2. The motivation of the algorithm is as follows.In order to obtain a solution pair (x * , y * ) to the system (2.1) we generate two separate iterative sequences.The first sequence is denoted by {x ℓ } ∞ ℓ=0 and employs Lemma 3.2 to find x * as a solution to the system (3.8).To achieve this, we consider the least squares problem associated with (3.8), which is Lines 4 to 8 of the algorithm generate x ℓ+1 from x ℓ by sequentially updating x ℓ with constant step-size line searches for each of the summands in (3.10), in the descent direction given by the descent pair provided by Assumption 2.
The purpose of the second sequence y ℓ ∞ ℓ=0 is to find the point y * .Lines 9 to 13 show how to obtain y ℓ+1 from y ℓ and x ℓ+1 .Indeed, y ℓ is sequentially updated by performing a constant step-size line search for each of the functions in the descent direction determined by the descent pair given by Assumption 2.
1 Initialization.Choose x 0 , y 0 ∈ R M .Set an index t ∈ {1, 2, . . .,W } in compliance with Assumption 2 and pick real fixed relaxation parameters λ x > 0 and λ y > 0; for q = 1, 2, . . .,W do y ℓ,q = y ℓ,q−1 + λ y b q − β q (x ℓ+1 , y ℓ,q−1 ) ; end Set y ℓ+1 = y ℓ,W ; In Section 4, we illustrate the good performance of the proposed Algorithm 2 in a demonstrative numerical experiment arising from a real-world problem in tomographic absorption spectroscopy, in which our algorithm shows a competitive potential vis-à-vis with state-of-the-art methods in the field.

AN EXPERIMENTAL DEMONSTRATION IN TOMOGRAPHIC ABSORPTION SPECTROSCOPY
4.1.Introduction to tomographic absorption spectroscopy.Absorption spectroscopy is a popular technique for gas sensing which can simultaneously retrieve thermophysical properties such as temperature, species concentration and pressure [7].When a laser beam penetrates a region of interest (ROI) filled with gaseous medium, its intensity is attenuated due to the absorption of the gas molecules along the line-of-sight (LOS).
The aim of absorption spectroscopy is to obtain information of the gaseous medium by measuring spectrum-specified absorbance.The Beer-Lambert law, which relates the attenuation of light to the properties of the material through which the light is traveling, provides the relationship between the gas properties (i.e., pressure, temperature and concentration) and the absorbance, see [33], b(ν) := ln(I 0 (ν)/I t (ν)) = ˆPS(ν, x) y dL. (4.1) The absorbance b(ν) is the logarithmic ratio of the incident I 0 (ν) and transmitted I t (ν) intensities for the absorption line at wavelength ν.In (4.1), P is the pressure, which is supposed to be a known constant, and S is the line strength function, whose value depends on the wavelength ν and the temperature x.For each wavelength of interest, the line function S has an approximately negative exponent relation with the reciprocal of the temperature, i.e. 1/x.The concentration of the absorbing species is y and L is the length of the LOS.
In practical applications, the gas properties are usually non-uniform along the LOS, therefore, tomography is needed to enable the spatial resolution of absorption spectroscopy.In order to represent a mathematically tractable model, full discretization is done, meaning that the light spectrum is discretized into a finite number of wavelengths, the ROI is discretized into a finite number of pixels/voxels and the external light sources are discretized into a finite number of individual beams.This modeling of tomographic absorption spectroscopy (TAS) is illustrated in Figure 1.
where b k i is the absorbance along the i-th beam with length L i at the k-th wavelength ν k , and α ν k is the absorption coefficient, related to the wavelength ν k and the local values of x and y.
Finally, we assume that the ROI is discretized and consists of a two-dimensional square, meshed into M = n × n square pixels, indexed by j = 1, 2, . . ., M. Thus, denoting by L i, j the length of intersection of the i-th light beam within the j-th pixel, equation (4.2) is reinterpreted in its fully-discretized form as for k = 1, 2, . . .,W, i = 1, 2, . . ., N, j = 1, 2, . . ., M. The α k : R × R → R are nonlinear operators which measure the absorption coefficient at the k-th wavelength.Note that after the discretization, x and y are redefined as vectors in R M .
Repeating equation (4.3) for all the beams at the k-th wavelength, a set of equations is formulated in matrix form as ) for all k = 1, 2, . . .,W , where L denotes the matrix L = (L i, j ) N,M i=1, j=1 .Thus, the fullydiscretized modeling of the tomographic problem in TAS results in solving the nonlinear system of equations where the vectors b k are known from measurements for all wavelengths and the matrix L is calculated according to the beam arrangement.In the past decades, numerous studies have made progress in this problem.Next, we give a brief glance of it.

4.2.
Techniques for solving TAS problems.Both nonlinear or linear approaches have been applied in the literature to solve system (4.5).Depending on the approach for solving the tomographic problem, TAS can be divided into nonlinear TAS and linear TAS.
In nonlinear TAS, all the equation systems are considered together and the inversion is recast into a one-step optimization problem as [30]: The x and y distributions are directly solved from (4.6).This optimization problem is usually solved by a global heuristic optimization algorithm, such as simulated annealing (SA) [6,30].
The SA algorithm was first proposed by Kirkpatrick et al. in 1983 [31] and has been widely employed in large-scale optimization problems [32,8].It is a heuristic for finding the global minimum inspired by annealing in metallurgy.A prominent advantage of SA is its insensitivity to the initial guesses [34], while the major drawback is its high computational cost.Besides, a priori information can be taken into consideration [30], which is another advantage of SA.
On the other hand, in linear TAS the problem is divided into two stages.In the first stage, for every k ∈ {1, 2, . . .,W }, equation (4.4) is solved for each individual wavelength as a linear equation system whose variables are the local absorption coefficients α k .Classical algorithms, including algebraic reconstruction technique (ART) [26], maximum likelihood expectation maximization (MLEM) [22] and Tikhonov reconstruction [20], have been extensively adopted for this stage.
For the second stage, absorption coefficients a k = (a k j ) M j=1 ∈ R M , at each pixel j and for each wavelength k, are supposed to have been recovered in the first stage, i.e., . . .
Now the properties x and y need to be solved according to the nonlinear relationship between them and the absorption coefficients.While, for every k ∈ {1, 2, . . .,W }, the absorption coefficients a k have been recovered in the first stage, the α k : R × R → R become here operators.Commonly, in the literature, gas properties within each pixel were calculated from the a k s using a nonlinear fitting (NF) method [27].This nonlinear fitting process employs a trust-region-reflective algorithm [15] to find the least-squares of the discrepancy between a k and α k (x, y).The iterative process of this algorithm is briefly described in Algorithm 3, see [35].In the algorithm, g and H refer to the first and second order derivatives of the function to be solved, respectively, and s = (s 1 , s 2 ) is the step of the iteration.It should be noted that an analytical expression of the nonlinear tomographic absorption spectroscopy problem is not derivable.Here, the g and H are results of an approximation of the nonlinear tomography formulation.
Algorithm 3: The Nonlinear Fitting (NF) method applied in multi-spectral TAS.
1 Initialization: Set δ ∈ R. Choose x 0 j ≥ 0 and y 0 j ≥ 0; 9 adjust the trust region size δ ; else reduce the trust region size δ ; end A drawback of this method is that it still has a high computational cost.Besides, since the algorithm obtains x j and y j pixel by pixel, a priori information cannot be employed in this algorithm for the process of obtaining temperature and concentration results from absorption coefficients.That means, in traditional approaches for TAS, a priori information is solely employed in the process to solve the absorption coefficients, which is sometimes defective.For example, if the temperature and the concentration satisfy different a priori information, the method to add a priori information on absorption coefficients cannot work.In addition, the calculation of temperature and mole fraction from absorption coefficient can introduce extra errors due to complicated error propagation.In the following sections, we motivate the implementation of our proposed Algorithm 2 for tackling problems of the form given by (4.7), and present a demonstrative example in which it outperforms the NF approach.

Implementation of the descent pairs algorithm. Let R M
++ denote the positive orthant of R M .For every k ∈ {1, 2, . . .,W }, denote by ++ the operator defined component-wise as where α k : R ++ × R ++ → R ++ are the operators in (4.7).This yields the system of equations It is a known property of the absorption coefficients that the operators α k fulfill Assumption 1 [7].Thus, there exist operators In order to validate the implementation of the descent pairs algorithm (Algorithm 2) to this problem, we employ a property of the TAS system of equations (4.9) which is known from the physics of the problem and is presented in the next remark.
Remark 2. For a fixed index t ∈ {1, 2, . . .,W }, define for each k ∈ {1, 2, . . .,W } \ {t} the functions Then, it is known from the physics of the problem that the functions f k are given component-wise by where S k , S j > 0 and E k , E t ∈ R are constants which depend on the wavelength, and T 0 is a positive known constant.
The following lemmata assure that Assumption 2 holds for the system of equations (4.9).Lemma 4.1.Let t be an index such that E t = min {E 1 , E 2 , . . ., E W } and let x ∈ R M ++ be a fixed arbitrary vector.Then, for any k ∈ {1, 2, . . .,W } \ {t}, the vector given by is a descent direction at the point x for the function g k : R M ++ → R + defined as Proof.The partial derivative of the j-th component of f k is given by Therefore, the gradient of g k at the point x ∈ R M ++ is given by Hence, we have since S k , S t > 0, E k − E t > 0 for all k and all diagonal elements of the matrix are positive.□ Lemma 4.2.Let x, ȳ ∈ R M ++ .The vector w k given by is a descent direction at the point ȳ for the function h k : R M ++ → R + defined by Proof.The gradient of h k at the point ȳ is given by Thus, we have where the last strict inequality holds since The above analysis shows that the system of equations (4.9) fulfills Assumptions 1 and 2, making Algorithm 2 a good choice for handling it.In Section 4.5 we provide a numerical demonstration of its good performance.

4.4.
Improving the descent pairs algorithm by superiorization.The iterative nature of Algorithm 2 enables us to introduce a priori conditions into the iterative process.We do this via the superiorization methodology.4.4.1.The superiorization methodology.The superiorization methodology [29] works by taking an iterative algorithm, investigating its perturbation resilience, and then, using proactively such permitted perturbations, it forces the perturbed algorithm to do something useful in addition to what it was originally designed to do.The original unperturbed algorithm is called the basic algorithm and the perturbed algorithm is called the superiorized version of the basic algorithm.
When the basic algorithm is computationally efficient and useful in terms of the application at hand, and the perturbations are simple and not expensive to calculate, then the advantage of this methodology is that, for essentially the computational cost of the basic algorithm, we are able to get something more by steering its iterates according to the perturbations.A detailed description of the superiorization methodology along with pertinent up-to-date references can be found in several papers, see, e.g., [12], [28] This general principle has been successfully used in a variety of important practical applications, see the recent papers in the, compiled and continuously updated, bibliography of scientific publications on the superiorization methodology and perturbation resilience of algorithms [10], where many applications oriented works with the method appear, e.g., [23].
In the language of [11], the superiorization methodology allows us to employ a given function, called target function, ϕ : R M → R. It interlaces into the iterations of a basic algorithm steps that perform locally reductions of the target function, these steps are called perturbations.The resulting superiorized version of the basic algorithm is expected to retain the convergence properties of the basic (unperturbed) algorithm but, additionally, steer the process to an output with reduced, not necessarily minimized, value of the target function.
We use this methodology to superiorize the descent pairs algorithm (Algorithm 2), as we describe below.4.4.2.Implementation of the superiorized version of the descent pairs algorithm.In our implementations, priors are applied to Algorithm 2 according to the superiorization methodology.Corresponding target functions ϕ, mentioned above, to lead the function reduction perturbations in the superiorization process are the priors defined by [38].These are the total variation (TV) function and the smoothness (Tikhonov, Tik for short) function [20], defined, respectively, by and where Z represents one of the properties x or y, {(i, j)} n,n i=1, j=1 is a set of pairs of indices of pixels in an exact grid in the ROI (see Figure 1), rn is the number of pixels that surround pixel (i, j), and (ii, j j) are the indices of these pixels.The pseudo-code of this method applied to (4.9) is shown in Algorithm 4. In the experimental runs the function ϕ in the algorithm will be either ψ TV (Z) or ψ Tik (Z) of (4.22) or (4.23), respectively.Although in the experiments only the above two functions are chosen as target functions, and thus gradients are involved in the pseudo-code below, it should be emphasized that in general any target function can be considered and function reduction is not necessarily via derivatives [12,13,24].To avoid zero as the divisor when taking gradient for the TV function, we add a small constant 10 −5 under the square root sign in our application.
In this algorithm, λ is the step length of the DPA iteration and η is the step length for superiorization, which is contracted according to γ as the iteration processes.The value of γ and the initial value of η influence the impact of the superiorization together.We denote by v the normalized gradient of the function ϕ at the current iteration and z is an intermediate variable between the superiorization and the DPA iteration.The square ROI was divided into 40×40 pixels grids and 160 beams were employed, which were distributed uniformly and parallelly in four directions, 0 • , 45 • , 90 • and 135 • .Ten discrete wavelengths were selected from the H 2 O absorption spectrum.
In order to observe the progress and behavior of the DPA algorithm (Algorithm 2) for solving the second stage of the multi-spectral TAS problem, as explained in Subsection 4.2 above, we compared its performance with the NF algorithm (Algorithm 3).In our implementation with MATLAB, the nonlinear least-squares solver, i.e., the function "lsqnonlin", was applied as the NF algorithm, wherein MATLAB's trust-region algorithm was employed in this function to find the solution.For the solution of the first stage we employed the superiorized ART algorithm [5,21].
Algorithm 4: The Superiorized version of the descent pairs algorithm (Algorithm 2), nicknamed SUP-DPA, applied in multi-spectral TAS.
For the SUP-DPA algorithm (Algorithm 4), in the reconstruction of Phantom 1, TV was employed as the prior, i.e., as the function ϕ in the algorithm.The SUP-DPA algorithm was initialized with λ x = 1000, λ y = 2, β x = 5 × 10 6 , β y = 10 and γ = 0.999.Initial guesses of x 0 and y 0 were vectors randomly chosen in the intervals described above for the DPA algorithm.
For the reconstruction of Phantom 2 with the SUP-DPA algorithm, smoothness was regarded as the prior, thus, the function ϕ in the algorithm was chosen as ψ Tik of (4.23) and the algorithm was initialized with λ x = 1000, λ y = 2, β x = 5 × 10 4 , β y = 10 and γ = 0.999.Initial guesses of x 0 and y 0 were vectors randomly picked in the intervals described above for the DPA algorithm.
All runs of the DPA and the SUP-DPA algorithms were stopped when either the number of iterations exceeded 50 or when ∑ W k=1 ∥b k − β k (x ℓ+1 , y ℓ+1 )∥ < 10 where the subscripts "rec" and "act" stand for reconstructed value and actual value, respectively.
It can be seen that after the stopping criterion function value fell below 10 −3 , the relative error hardly changes, indicating that a convergence is reached.The relative error of SUP-DPA had a slight fluctuation at the left end of the curve because of the perturbation introduced.On the other hand, 50 iterations is a large enough number for this testing case to assume convergence when the stopping criterion function value cannot decrease to 10 −3 .For the NF algorithm, the function tolerance and optimality tolerance were both set to 5 × 10 −10 .When implementing the NF algorithm, δ was initially set as the 2-norm of the difference between the upper and lower bounds, ((800, 2400) for x and (0.005, 0.2) for y), and the adjustment of δ followed the default settings in the MATLAB function "lsqnonlin".The algorithm was terminated when δ < 5 × 10 −10 .FIGURE 2. Relation between relative error and stopping criterion as the iterations proceed.
Results are shown in Figures 3 and 4 and Table 1.The name of the algorithms is shown at the top of the corresponding reconstructed profiles.The relative reconstruction errors are also labeled, which are defined by (4.25).To be specific, the unit of temperature x was K, which is labeled beneath its color-bar in the figures, and for the mole fraction y, the unit is one, thus, no unit was labeled.Figure 3 shows Phantom 1 and the results recovered by the DPA, the SUP-DPA and the NF algorithms.Temperature and concentration profiles generated by the three algorithms show similar shapes to those of Phantom 1, although there are some defects at the edges of the profiles.The reconstruction errors of SUP-DPA is smaller than those of the DPA and the NF algorithm.Besides, it takes much less time for the DPA and the SUP-DPA algorithms to finish the reconstruction than it took for the NF algorithm.Overall, the SUP-DPA algorithm can be regarded as the best among the three algorithms, for the particular experiments that we conducted and report here.
Figure 4 shows Phantom 2 and the results recovered by the DPA, the SUP-DPA and the NF algorithms.Results by the three methods show a similar shape to Phantom 2. The error of the DPA algorithm is slightly smaller than that of the NF algorithm, while the SUP-DPA algorithm improves the error of both other algorithms.Besides, the computational time of the SUP-DPA algorithm is much smaller than that of the NF algorithm.Though the DPA algorithm has the lowest computation time, its reconstruction effect is worse than that of the SUP-DPA algorithm.Therefore, the SUP-DPA algorithm can still be regarded advantageous over the DPA and the NF algorithms.
To further investigate the performance of the algorithms, simulation studies were conducted under different spatial resolutions.For Phantom 2, reconstructions were conducted when it was divided into 20 × 20, 40 × 40, 60 × 60 and 80 × 80 pixel grids, denoted as "gridding scale" 20, 40, 60 and 80, respectively.To ensure the comparability of the results, the measurement of each case was conducted in four directions, while the number of beams in each direction increased correspondingly to the gridding scale, which means that for a G × G grid, 4 × G measurement beams were applied.Other conditions, including noise level, parameters and stopping criteria, were the same as those described in Figure 4.The results are shown in Figure 5.
As the pixel grid was made finer, the computation time grew exponentially, indicating an increasing computational efficiency improvement of DPA and SUP-DPA compared with NF.In each simulation condition, DPA and SUP-DPA had better performance than

FIGURE 1 .
FIGURE 1. Illustration of the fully-discretized model for tomographic absorption spectroscopy.

4. 5 .
A numerical experiment.In this subsection, Algorithms 2, 3 and 4 are nicknamed as Algorithm DPA, NF and SUP-DPA, respectively.We conducted simulations on numerous temperature and concentration phantoms, two of which are shown here as demonstrative examples.Phantom 1 mimics the temperature and H 2 O concentration in a two-dimensional McKenna flame.Phantom 2 imitates smooth temperature and concentration distributions consisting of two Gaussian peaks.

 2 , 2 ,
−3 .Figure 2 shows the relation between relative error in x and the stopping criterion function ∑ W k=1 ∥b k − β k (x ℓ+1 , y ℓ+1 )∥ as the iterations proceed for DPA and SUP-DPA.The relative errors are defined as   Error x := ∥x rec −x act ∥ 2 ∥x act ∥ Error y := ∥y rec −y act ∥ 2 ∥y act ∥

FIGURE 3 .
FIGURE 3. (a) Temperature profile of Phantom 1. Temperature profiles recovered by (b) the DPA algorithm, (c) the SUP-DPA algorithm, and (d) the NF algorithm.(e) Concentration profile of Phantom 1. Concentration profiles recovered by (f) the DPA algorithm, (g) the SUP-DPA algorithm, and (h) the NF algorithm.

FIGURE 4 .
FIGURE 4. (a) Temperature profile of Phantom 2. Temperature profiles recovered by (b) the DPA algorithm, (c) the SUP-DPA algorithm, and (d) the NF algorithm.(e) Concentration profile of Phantom 2. Concentration profiles recovered by (f) the DPA algorithm, (g) the SUP-DPA algorithm, and (h) the NF algorithm.

FIGURE 5 .
FIGURE 5. (a) Computation time, (b) reconstruction error for x, and (c) reconstruction error for y, with respect to the gridding scale.

TABLE 1 .
Computation times of the algorithms Computation time (sec) DPA SUP-DPA NF